*
* Soc 229A: Event History Analysis
*
* Assignment #1 do file
* -- Annotated, to help you learn Stata

* I put the following commands at the top of every do-file:

* Specify which version of Stata you are using
* This ensures compatibility if you switch to a newer version down the road...
version 10

* Close the log, in case we accidentally left it open before
capture log close

* Create a log file which will contain the text output (won't include graphs)
log using "C:\Users\schofer\Documents\Classes\2008 Soc 229 EHA\Examples\GSSex.log", replace

* Start by clearing memory
clear

* Now, ask Stata to set aside enough memory for a big dataset
set memory 100m

* Load the dataset with the "use" command
* You will have to change this to reflect the location of the
* data file on your computer
use "C:\Users\schofer\Documents\Data\GSS\GSS2006subset.dta"

* Get rid of cases with missing data on critical variables
* Note: Stata uses a period (.) to indicate missing data
* Note: Stata requires 2 equals signs (==) to test equality in an "if" expression...
drop if AGE == .

* Create end-state variable
* 1 for people with kids (AGE-KID-BORN is not missing)
* zero for people without kids (AGE-KID-BORN is missing)
* The variable must exist before "replace" can change it, so create it first
gen endstate = 0
* Note: ~= and != can be used to indicate "not equal"
replace endstate = 1 if AGEKDBRN ~= .
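
* Optional check (not part of the original assignment): a minimal sketch for
* verifying the end-state coding, using only the variables created above.
* "tabulate" with the "missing" option shows how many cases fall in each category.
tabulate endstate, missing
* Stata also has extended missing codes (.a through .z). The test "== ." matches
* only the plain period, whereas missing() is true for all of them.
* Whether this GSS subset uses extended codes is an assumption to check:
* if the two counts below differ, some cases were coded as having kids
* even though AGEKDBRN is missing.
count if missing(AGEKDBRN)
count if AGEKDBRN == .
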
* Create end-spell variable
* Ends at time of childbirth for people with kids
* Ends at current age (time of censoring) for people without kids
gen endtime = AGE
replace endtime = AGEKDBRN if endstate == 1

* Generate a variable of interest
gen pre1960cohort = 0
replace pre1960cohort = 1 if COHORT < 1960

* The "stset" command prepares the data for event history models
* It tells Stata which variable has end-of-spell information
* and how to know if the event ("failure") has occurred
stset endtime, failure(endstate==1)

* Make a survivor plot
sts graph, surv

* Make a survivor plot, broken out by groups defined by a dummy variable
sts graph, surv by(pre1960cohort)

* Make a smoothed hazard plot, broken out by groups
sts graph, haz by(pre1960cohort)

* Make an integrated (cumulative) hazard plot
sts graph, cumhaz
sts graph, cumhaz by(pre1960cohort)

* I put these commands at the end of every Stata file:

* Close the log
log close

* All done!
exit